High quality genomes produced from single MinION flow cells clarify polyploid and demographic histories of critically endangered Fraxinus (ash) species

With populations of threatened and endangered species declining worldwide, efforts are being made to generate high quality genomic records of these species before they are lost forever. Here, we demonstrate that data from single Oxford Nanopore Technologies (ONT) MinION flow cells can, even in the absence of highly accurate short DNA-read polishing, produce high quality de novo plant genome assemblies adequate for downstream analyses, such as synteny and ploidy evaluations, paleodemographic analyses, and phylogenomics. This study focuses on three North American ash tree species in the genus Fraxinus (Oleaceae) that were recently added to the International Union for Conservation of Nature (IUCN) Red List as critically endangered. Our results support a hexaploidy event at the base of the Oleaceae as well as a subsequent whole genome duplication shared by Syringa, Osmanthus, Olea, and Fraxinus. Finally, we demonstrate the use of ONT long-read sequencing data to reveal patterns in demographic history.

Figure S3: Syntenic regions and frequency plots for synonymous substitution rates (Ks) between syntenic CDS pairs for each partially diploid Fraxinus assembly and itself.
Self-vs-self syntenic dot plots for (a&d) F. americana, (b&e) F. nigra, and (c&f) F. pennsylvanica. Colors in Ks histograms correspond with colors of each point in the dot plots.
Figure S4: Syntenic regions and frequency plots for synonymous substitution rates (Ks) between syntenic CDS pairs for each haploid Fraxinus assembly and itself.
Self-self syntenic dot plots for (a&d) F. americana, (b&e) F. nigra, and (c&f) F. pennsylvanica. Colors in Ks histograms correspond with colors of each point in the dot plots. Ks frequency plots for Huff et al.¹ assemblies are also included for comparison: (g) F. americana-v0.2.1, (h) F. nigra-v0.2.1, and (i) F. pennsylvanica-v1.4.

Demographic curve colors represent the following genome assemblies: blue: F. americana; green: F. nigra; pink: F. pennsylvanica; yellow: F. pennsylvanica reference assembly (v1.4). PSMCR plots were generated by mapping F. americana, F. nigra, and F. pennsylvanica against their own ONT long-reads and the F. pennsylvanica reference assembly mapped against its Illumina short-reads and filtered to the same coverage as the others. All curves assume a mutation rate of 7.77e-9 and a generation time of 15 years. The bcftools multiallelic and rare-variant calling model was used and maxt was set to 10 for PSMCR.

Thin black arrow points to the hump in our F. pennsylvanica's demographic curve. (a) PSMCRs created using our F. pennsylvanica assembly and ONT long reads (pink) and Illumina short reads from the F. pennsylvanica reference assembly (yellow). Each curve assumes a mutation rate of 7.77e-9 and a generation time of 15 years. (b) Same as (a), but the short-read PSMC curve has its mutation rate lowered to 5.03e-9 using the same formula used to adjust the long-read PSMCs in Figure 2a (Table S9). Effective population size (Ne) is on the y-axis and time in years is on the x-axis. The bcftools multiallelic and rare-variant calling model was used and maxt was set to 10 for PSMCR.

(a) PSMCRs created using our F. pennsylvanica assembly and ONT long reads (red), Illumina short reads from the F. pennsylvanica reference assembly (light purple), and Illumina short reads from the F. pennsylvanica reference assembly that has been filtered to match the depth of coverage of the long-read mapping (dark purple). (b) PSMCs created using the F. pennsylvanica reference assembly and its own Illumina short reads (orange), its own Illumina short reads that have been filtered to match the depth of coverage of the long-read mapping (dark orange), and the ONT long reads from our F. pennsylvanica sample (grey). Effective population size (Ne) is on the y-axis and time in years is on the x-axis. The bcftools multiallelic and rare-variant calling model was used and maxt was set to 10 for PSMCR.

Depth of reads mapped against corresponding primary Flye assemblies for (a) F. americana, (b) F. nigra, and (c) F. pennsylvanica. Read depth is on the x-axis and frequency is on the y-axis.

Figure S1: Raw read length and quality distributions for Fraxinus spp.

Figure S2: Haploidy assessment of Fraxinus genome assemblies before and after running Purge Haplotigs.

Figure S3: Syntenic regions and frequency plots for synonymous substitution rates (Ks) between syntenic CDS pairs for each partially diploid Fraxinus assembly and itself.

Figure S6: One-to-one macrosynteny relationships among members of the Oleeae tribe.

Figure S9. Syntenic dot plot and fractionation bias between F. americana and J. sambac.

Figure S14: Syntenic regions and frequency plots for synonymous substitution rates (Ks) between syntenic CDS pairs for each haploid Fraxinus assembly and Vitis vinifera.

Figure S15: Syntenic regions and frequency plots for synonymous substitution rates (Ks) between syntenic CDS pairs for each haploid Fraxinus assembly and the F. pennsylvanica reference assembly.

Table S3: Repetitive element statistics for Fraxinus spp. primary Flye assemblies

Table S5: Repetitive element statistics for Fraxinus spp. haploid assemblies compared with Fraxinus spp. assemblies generated in Huff et al.¹

Table S6: RagTag statistics for scaffolding long-read haploid Fraxinus assemblies using the F. pennsylvanica reference assembly from Huff et al.¹

Table S7: Assembly and annotation statistics for Fraxinus assemblies generated from RagTag. F. pennsylvanica v1.4.1 is the chromosome-level assembly by Huff et al.¹ with the annotation generated in this study.

Table S8: Plant material and voucher collection details